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We give details of our precise determination of the light quark masses m u d = (m u +md)/2 
and m s in 2 + 1 flavor QCD, with simulated pion masses down to 120 Me V, at five lattice 
spacings, and in large volumes. The details concern the action and algorithm employed, the 
HMC force with HEX smeared clover fermions, the choice of the scale setting procedure 
and of the input masses. After an overview of the simulation parameters, extensive checks 
of algorithmic stability, autocorrelation and (practical) ergodicity are reported. To corrobo- 
rate the good scaling properties of our action, explicit tests of the scaling of hadron masses 
in Nf = 3 QCD are carried out. Details of how we control finite volume effects through 
dedicated finite volume scaling runs are reported. To check consistency with SU(2) Chiral 
Perturbation Theory the behavior of / m u & and F w as a function of m U( j, is investigated. 
Details of how we use the RI/MOM procedure with a separate continuum limit of the run- 
ning of the scalar density are given. This procedure is shown to reproduce the 
known value of rom s in quenched QCD. Input from dispersion theory is used to split our 
value of m uc i into separate values of m u and m^. Finally, our procedure to quantify both 
systematic and statistical uncertainties is discussed. 
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Summary 



1 Introduction 



The goal of this paper is to give technical details of our calculation of the average light quark 
mass m ud = (m u + m d ) /2 and of the strange quark mass m s at the physical mass point and in 
the continuum |0Q. This calculation is from first principles and sets new standards in terms of 
controlling all systematic aspects of a direct calculation of quark masses. Because the values 
m u and nid are also of fundamental importance, we determine them by combining our results 
for m u d and m s with dispersive information based on 77 — > 3n decays. A summary of recent 
determinations of light quark masses in Nf = 2 and Nf = 2 + 1 QCD is found in fl}. 

Let us begin by stating the minimum requirements for a first-principles determination of 
m ud and m s with fully controlled systematics: 

1. The action should belong to the universality class of QCD according to standard argu- 
ments, based on locality and unitarity, and an exact algorithm should be used. 

2. The light quark masses should be sufficiently close to their physical values such that an 
extrapolation, if necessary, can be performed without adding non-trivial assumptions. Our 
simulations are performed "a? the physical mass point", i.e. with values of M n and M K 
which bracket the physical values; this eliminates the need for a "chiral extrapolation". 

3. Simulations should be performed in volumes large enough to ensure that finite- volume 
effects are well under control (we use box sizes up to L ~ 6 fm). 

4. Simulations should be performed at no less than three lattice spacings a to make sure that 
a controlled extrapolation of all results to the continuum, a — > 0, can be performed. 

5. All renormalizations should be performed nonperturbatively, and the final result should 
be given in a scheme which is well-defined beyond perturbation theory (we will give our 
results in the RI/MOM scheme). 

6. The scale and other input masses should be set by quantities whose relation to experiment 
are direct and transparent (we use the masses of the Cl baryon, the pion and the kaon). 

The present work contains additional innovative features which are not required to give an 
ab-initio result, but help to keep all systematic errors small: 

7. We devise a method, tailored to needs of studies with Wilson-like fermions, to reconstruct 
the renormalized quark masses m udi and m s from the much simpler renormalization of the 
quantities m a jm udi and m s — m ud . We call it the "ratio-difference method". 

8. We propose an approach which overcomes the RI/MOM "window-problem". It is based 
on taking a separate continuum limit of the evolution function Rs(p, [!>') of the scalar 
density S from a scale p ~ 2 GeV, where the RI/MOM procedure yields reliable results, 
to a scale p' ~ 4 GeV where one may make (controlled) contact with perturbation theory. 

9. We use an advanced analysis procedure to assess the size of both the statistical and the 
systematic uncertainties (the same one as in [2]). 
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Figure 1 : Summary of our simulation points. The pion masses and the spatial sizes of the 
lattices are shown for our five lattice spacings. The percentage labels indicate regions, in which 
the expected finite volume effect £H/ on M w is larger than 1%, 0.3% and 0.1%, respectively. In 
our runs this effect is smaller than about 0.5%, but we still correct for this tiny effect. 



In our view, item 2 marks the beginning of a new era in numerical lattice QCD, because it 
avoids an extrapolation in quark masses which, generically, requires strong assumptions, thus 
relinquishing the first-principles approach (see the discussion in HI). 

To give the reader an overview of where we are in terms of simulated pion masses M v and 
spatial box sizes L, a graphical survey of (some of) our simulation points is provided in Fig.Q] 
(with more details given in Sec. 5). We have data at 5 lattice spacings in the range 0.054 — 
0.116 fm, with pion masses down to ~ 120 MeV and box sizes up to ~ 6 fm. Comparison with 
Chiral Perturbation suggests that our finite volume effects are typically below 0.5%, and close 
to the physical mass point (which is the most relevant part) even smaller. Still, we correct for 
them by means of Chiral Perturbation Theory Q, and test the correctness of this prediction 
through explicit finite volume scaling runs (see below). 

The remainder of this paper is organized as follows. In Sec. 2 details are given concerning 
the action and algorithm employed, while Sec. 3 specifies how one determines the HMC force 
with HEX smeared clover fermions. Our choice of the scale setting procedure and of the in- 
put masses is discussed in Sec. 4, with simulation parameters tabulated in Sec. 5. Checks of 
algorithmic stability are summarized in Sec. 6, while autocorrelation and (practical) ergodicity 
issues are reported in Sec. 7. To corroborate the good scaling properties of our action, explicit 
tests of the scaling of hadron masses in Nf = 3 QCD are carried out, see Sec. 8. Details of how 
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we control finite volume effects through dedicated finite volume scaling runs are reported in 
Sec. 9. To test consistency with SU(2) Chiral Perturbation Theory the behavior of M%/m ud and 
Fj, as a function of m u d is investigated in Sec. 10. Details of how we use the RI/MOM proce- 
dure with a separate continuum limit of the running of the scalar density Rs(fJ>, //) are given in 
Sec. 1 1. To show the reliability of this procedure the known value of rom s in quenched QCD is 
reproduced, see Sec 12. In Sec. 13 it is discussed how one may use input from dispersion theory 
to split our value of m ud into separate values of m u and m d . Sec. 14 specifies our procedure to 
quantify both systematic and statistical uncertainties. Our final result is summarized in Sec. 15. 



2 Action and algorithm details 

We use a tree-level 0( a 2 ) -improved Symanzik gauge action [@]| together with tree-level clover- 
improved Wilson fermions [[5]|, coupled to links which have undergone two levels of HEX 
smearing. The latter derives from the HYP setup [6], but with stout/EXP smearing [7] as the 
effective ingredient - see (8]| for details. In terms of the original [U^(x)] and smeared [V^(x)] 
gauge links (see below) our action takes the form flUEl 



g gSym _|_ cjiSW 

5 Sym = /3 [|^ ReXr(1 _ C/plaq) + ^^ ReXr(l _ C/re ^ 
6 plaq ^ rect 

with o pv — Ih^t, 7i/] and denoting the standard Wilson action, and where the expression for 
the field strength can be found in [5|. In S^ ym only the original thin links U^(x) are used. The 
parameters c , c\ [4J and csw (El are set to their tree-level values 

Cl = -1/T2, co = l-8 Cl = 5/3, csw = l. (2) 

Note that both the hopping part and the clover term in Sj W use the same type of HEX-smeared 
links V^(x) = V^\x). Those are constructed from the thin links V^(x) = U^(x) via (HI 
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V^\x) = exp(^P TA {r^)(x)^- 1 )(a;)t})^- 1 )(x) ee V™{x) (3) 

without summation over repeated indices. Here 

P TA {M} = ~[M-M*] - ^Tr[M-Mt] (4) 

denotes the traceless anti-hermitean part of the 3x3 matrix M. We use the parameters oi\ = 0.95, 
a 2 = 0.76 and « 3 = 0.38. In terms of the standard stout/EXP smearing convention 

rpH*) = E v}^ 1 \x)v^- 1 \x+u)vj n - 1 \x+^ 

V< n \x) = exp(pP TA {T^(x)V^ 1 \xy})v^- l \x) (5) 

the values of a«, i = 1,2, 3, above correspond to p=0. 158333, 0.19 and 0.19, respectively. This 
smearing differs from the one we used in [|2l[9]| in that the fermions interact even more locally 
with the gauge fields here (cf. the discussion in the supplementary online material of and the 
appendix of [H). Note, finally, that our action is exactly as ultralocal (in position space) as the 
original Wilson/clover action, since D(x, y) =0 for \x— y \ >1. 

As we use the hybrid Monte-Carlo (HMC) algorithm [10], a non-trivial ingredient with this 
action is the coding of the molecular dynamics (MD) force, which will be discussed in the next 
section. The MD updates are performed in quadruple precision, to ensure exact reversibility 
in our target (double precision) accuracy. Further particulars of our implementation - even- 
odd preconditioning [TTM . multiple time-scale integration ("Sexton-Weingarten scheme") lfT2l . 
mass preconditioning ("Hasenbusch trick") [fT3l . Omelyan integrator [14J, RHMC acceleration 
with multiple pseudofermions |[T5l and mixed-precision solver [9| - have been described in [|9]|. 
As has been noted in the literature [fT6lfT71 , combining several of these ingredients yields a 
dramatic reduction of the critical slowing down that has traditionally been observed for light 
quark masses. As we show in this paper, the thorough combination of all these ingredients 
allows for simulations directly at the physical mass point, in large volumes, with several lattice 
spacings, such that a controlled extrapolation to the continuum can be performed. 

3 HMC force with smeared links 

We use two steps of HEX smearing [8 ] in our fermion action Sf, both for the Wilson and for 
the clover term. Our Sf depends on the thin (unsmeared) links only through the smeared links 

Sf = s f (V {2) (V w (V {0) = U))) (6) 

where denotes the HEX smeared links, with n indicating the smearing level. Generically, 
the fermionic contribution to the HMC force is given by the gauge derivative SSf/8U. In order 
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to obtain the derivative of Sf with respect to U for our two-step smearing recipe, we will apply 
the chain rule twice, which leads us to the following scheme. First one calculates 



^ (7) 

which encodes the details of how the fermions are coupled to the smeared gauge fields. This 
part of the calculation is not related to the smearing, one just takes 8Sf[U]/6U and replaces 
U — > . The main consequence of the nested dependence © is the recursion formula 

SS, SS, 6VW 

* F777— n IP) 



where the proper definition of the star-product and of the second term will be given below. 
Thinking in terms of routines of the computer code, one such step takes the previous derivative 
5S,/SV^ and the links V^ n \ as input and yields the next derivative SSj/SV^l This 

procedure needs to be called n times, at the end we obtain the final fermion force 

5S f SS f 

— - = — (9) 

Since the extension to a second and possibly more steps is straightforward, we will only consider 
the derivation for one level of HEX smearing in the following. 

We now specify the two main ingredients in the derivation of the HMC force for fat-link 
actions, that is the gauge derivative and the pertinent chain rule. Since an SU (3) matrix in 
the fundamental representation is a complex 3x3 matrix with only 8 independent degrees of 
freedom, it is a structured matrix and the derivative has to be defined properly. 

The Lie algebra. For U G SU(3) an infinitesimal change can be written as 

U^U' = exp(Y, u AT A )u (10) 

A 

with real parameters ua and the set {T^|A = 1 . . . 8} forming a basis in the space of the 
traceless, anti-hermitian matrices, i.e. of the tangent space of the group. These matrices are 
normalized through Tt(TaT b ) = —Sab- Using the trace, one can define a scalar product on this 
vector space; for X = Y,a %aTa and Y = Y.A UaTa in the Lie algebra the scalar product 

(X,Y) = -Tr(XY) = J2xAVA (11) 

A 

allows one to build a projector which restricts an arbitrary 3x3 matrix, M, to the tangent space 
P TA (M) = - Ea T A Re Tr(T A M), with P TA (M) defined in ©. Furthermore, it is easy to show 
that for arbitrary matrices M and iV 

Re Tt(P ta (M)N) = Re Tr(P TA (N)M) . (12) 
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Complex valued functions. Let / be a complex valued function on the group SU(3). Its 
derivative with respect to the group element is a vector in the Lie algebra 



6U 



A 



8J_ 

5U 



(13) 



where the components are defined as 



5U 



J A 



du A 



lim 



f(exp(u A T A )U) - f(U) /u A = Tr(T A U 



df 
dU T 



(14) 



Throughout, the partial derivative with respect to U under the trace is to be understood as a 
derivative with respect to unconstrained matrix elements. In particular, this means that 



dU, 



cd 



dU, 



caVbd 



(15) 



ub 



If / depends on the adjoint matrix U\ then using the identity W = U 1 this dependence is 
converted into a dependence on U, with the consequence that 



dUl 



cd 



dU, 



(lb 



u ca u bd ■ 



(16) 



Real valued functions. For a real valued function / the group derivative takes the form 



5f - VT 



5U 



J2T A Tr(T A U 

A 



df 
dU T 



(17) 



SU(3) valued functions. Let V G SU{3) be a function of U, such that V : SU{3) -)■ SU(3) 
is a mapping on the group space. If U changes as in (flOl) . with small parameters u A = 0(e), 
then V will also undergo a small change, which may be written as 



V^V' = V(U') = exp(^2v B T B )V 



(18) 



where the small parameters v b = 0(e) are real- valued functions of the original parameters u A . 
Below we shall need the u A derivative of V, that is 



dV d(EBV B (u)T B ) 



du A du A 
which, upon taking the limit u A — > 0, implies 



V 



(19) 
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dv B 



J A 



B du A 
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(20) 



M=0 



Chain rule. Let the function / depend on U only via V, and let us calculate its deriva- 
tive with respect to U. Again, U transforms with infinitesimal parameters u A , resulting in an 
infinitesimal change of the variables vb = vb{u) of V. The usual chain rule yields 



df_ 

du A 



df 8vb 
g dv B du A 



(21) 



and, after taking the limit u A — > 0, we arrive at 



5U 



E 

B 



5V 



dv B 



du. 



(22) 



u=0 



With (fTTI) and the definition of the gauge derivative from ([13 



, this may be rewritten as 
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_su_ 
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_5U _ 



(23) 



(24) 



This formula is the chain rule for the gauge derivative, which can formally be stated as 

6f__ 6f_ 6V_ 
6U ~ 5V* 6U ' 

With these ingredients we can now specify the HMC force for a fermion action with one 
step of HEX smearing. In the following we will simplify our notation by replacing V^ ,n ^ with 
V®. One HEX smearing, — > V* 3 \ is built from three substeps (cf. Eq.[3]) 



x 



exp (P TA [C^(x)Ul(x)}) ■ U,(x) 
P TA [C^\x)UUx)}) ■ U,(x) 



exp 



(25) 



where P TA has been defined in © and C^ 2 \ represent staples constructed via 



E U a (x)U^x+a)Ul(x+ti) 



eft 



X 



~6~ 



E VW(x)V$(x+v)V™{x+ii) 



(26) 



with the factors ai/(2(d— 1)) included (for reasons to become obvious below). In the following 
we will drop Lorentz indices and space-time arguments for simplicity. 
The task is to calculate the HMC force [with V = V^ 3) ] 



6S f 5S f 5V 



f 



5U 



SV 5U 
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(27) 



where 5Sf/5V is already known. The A component of the star product reads [see (1231) 1 



5^ 
SU 



5S f 



5V® 



ah 



5U 



qt/(3) 

-^(T A U dVb " 



(28) 



where contains the part of the force which is already known 

E(3 ) = v m 6S f (v®) 

Since St is a real- valued function we can write this in the compact form 



(29) 



5S 



>ev> 



(3) 



p (r T ^(3) UV ba 



(30) 



To improve readability, let us temporarily denote by V. The last substep is then V 
exp(A)U, where A = P TA (CW) and thus 



5S f _ P (tjy dV ^ 

- r TA \ u ^ab 



5U 



du T 



p TA (ui:eMA))+P TA (uj: J e J^ )bc u c , 



(3D 



The derivative of the exponential of a traceless anti-hermitian A reads [see Eq. (68) of Q] 

d(exp(A)) = Tr(A ■ dA)B l + Tr(A 2 ■ dA)B 2 + fad A + f 2 A ■ dA + f 2 dA ■ A (32) 

with £>i 2 being second-order polynomials of A and fi i2 complex constants which depend on 
the trace and determinant of A. Using the cyclicity of the trace in the color indices, we arrive at 



P TA (UE 



dV, 



ba 



J ab 



dU T 



P TA C/Sexp(A) + 



.dA, 



Pta(U-^0 ■ [TriUZBjA + Tr(UXB 2 )A 2 + frlTZ + f 2 UXA + f 2 AUT] ba ) (33) 
where the second term contains the derivative of A — P TA (CW). We use the identity 

(34) 



8U T 



PT A (u^P TA (N) ba 



valid for arbitrary M and N [see (fT2l) 1 to shuffle the projector in A = P TA (CU^) to the matrix 
in square brackets in (1331) . Next, we use that the derivative of CW can be written as 



U 



(H ( V> )ab P TA [...] ba = U^(U^P TA [...]) ba - P TA [...] ■ CU^ 



(35) 



dU T VAl-Soa - dJjT 

and introduce a convenient notation for the expression in square brackets in ([331 by means of 



TA 



Tr(UEBx)A + Tr (U"EB 2 ) A 2 + f-JJH + f 2 UEA + f 2 AUE 



(36) 
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With this at hand, and using the relation exp(A) = VU\ we arrive at the compact expression 

P TA (f/£ ab ^|) = P TA (u(ZV-ZC)rf + U^0Z U ) ■ (37) 

Finally, we reintroduce the superscript (3) and note that the U dependence of C^> comes 
exclusively from V^ 2 \ With these adjustments, relation (1371) takes the form 

,f)T/( 3 ) f)T/( 2 ) 

M™$Wr) = M U P® V ® - + P^T-^r) (38) 

where Y,^ is defined as 

(2) _ dC cd ( 3 ) 
^afe - 77^*= (^ 
u 'ba 

meaning that the term S^ 2 ^ can be calculated in the similar way as S^ 3 ^. This procedure can be 
iterated, and we find for an action with one step of HEX smearing 



6^ 



]T P TA {u(Y (i) V^ - Z®C®)rf) + P TA (f/£ (0) ) (40) 



where £W is defined as 



The object dC^ +v> / dV^ is a straightforward staple derivative, where some care needs to be 
taken w.r.t. the Lorentz indices. With this formula, one may implement a routine which calcu- 
lates the HMC force for a fermion action with one step of HEX smearing. The extension to a 
second smearing step is realized through a second call to this routine as shown in ©. 

This calculation of the HMC force with HEX smeared fermion actions extends the results 
of JTllISll. An early treatise of the HMC force for fat-link fermion actions is found in lTT9ll . 



4 Scale setting and input masses 

To set the scale and to adjust the light and strange quark masses m ud = (m u + m d ) /2 and m s 
to their physical values, we need to identify three quantities which can be precisely computed 
on the lattice and measured in experiment. We will use the mass of the baryon and of the 
pseudoscalar mesons ir, K for this purpose, in the latter case with a small correction for isospin- 
breaking and electromagnetic effects (see below). 

In other words, at the point where M^/Mq and M k /Mq agree with the physical values of 
these ratios, the measured value of aM^ is identified with the lattice spacing times 1.672 GeV 
GUI , and this yields a -1 . In 10 we used also the H baryon to set the scale. As discussed there, 
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correlation functions of spin 3/2 baryons are somewhat noisier than those of spin 1/2 baryons. 
On the other hand, the more light valence quarks present in a baryon, the larger the fluctuations. 
In with M T down to 190 MeV these two effects balanced each other, rendering the and 
the E equally good choices. In the present paper we go down to M n = 120 MeV, and the f2 with 
no light valence quark is the better choice. We use the standard mass-independent scale-setting 
scheme, in which this lattice spacing is subsequently attributed to all ensembles with the same 
coupling /? = 6 / (?q and Nf. 

Our ensembles bracket the physical quark masses m U( i and m s in the sense that the span of 
simulated M 2 and M 2 S = 2Mj c —M 2 encompass the physical values given below. As a result, it 
suffices to use a parametrization of clMq as a function of (aM n ) 2 and (aM sS ) 2 which describes 
the entire data set. We find that the Taylor ansatz aM n = co+Ci(aM 7T ) 2 +c 2 (aM sS ) 2 +c 3 (aM sS ) 4: 
works perfectly. 

Since our lattice simulations are performed in the isospin symmetric limit = m u and do 
not account for electromagnetic interactions, the physical input meson masses must be corrected 
for these effects. The account of this as given by FLAG IfSTTl is essentially a refined version of 
the analysis presented by MILC [22] some time ago. The bottom line is that in the framework 
mentioned above one should use the input masses M w = 134.8(3) MeV, M K = 494.2(5) MeV, 
which means that the electromagnetically corrected, isospin-averaged pseudoscalar input meson 
masses essentially agree with the PDG values of M n o and ^(Mj c+ + M^ ), respectively. 

5 Simulation parameters 

An overview of our Nf = 2 + 1 QCD simulations is presented in Tab.Q] For each ensemble 
we indicate the bare parameters, the lattice geometry, and the ensemble length in r = 1 units 
(after thermalization). In addition, the pion mass for the given parameters (determined with a 
specific choice of the fitting interval) is given. Note that the quoted error is only statistical - 
a detailed account of our procedure to keep track of the systematic uncertainties is given in 
Sec. 14. With Wilson fermions negative bare masses are not disturbing; after renormalization 
they will evaluate to positive quark masses (see Sec. 11). We work with spatial volumes as large 
as L 3 ~ (6 fm) 3 and temporal extents up to T ~ 8 fm. Besides reducing finite- volume corrections 
and excited-state contaminations, large (four-dimensional) volumes tend to decrease statistical 
uncertainties to the same extent as increasing the number of trajectories (in a fixed volume) 
would do. For instance, in a L 4 box 1300 trajectories at M 7r L = 4 are approximately equivalent 
to 4100 trajectories at M n L = 3. With an HMC-type algorithm, the effort (at fixed pion mass) 
grows like L 5 . Nevertheless, in view of the increased algorithmic stability (see below), choosing 
large four-dimensional volumes is a beneficial strategy. 

The integrated autocorrelation times of the smeared plaquette and of the number of conju- 
gate gradient iterations in the HMC accept/reject step are at most O(10) trajectories. Depending 
on this autocorrelation time, the gauge field after every fifth or every tenth trajectory is stored 
as a configuration to be used for calculating hadronic observables. 
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-0.09933 


-0.0400 


48 3 x 48 


1240 


0.0804(13) 


3.94 




-0.09000 


-0.0440 


24 3 x 64 


1065 


0.2024(10) 


4.86 




-0.09300 


-0.0440 


32 3 x 64 


1030 


0.1717(08) 


5.50 




-0.09530 


-0.0440 


32 3 x 64 


1300 


0.1457(09) 


4.66 




-0.04800 


-0.0023 


32 3 x 64 


1500 


0.1354(06) 


4.33 




-0.02500 


-0.0060 


16 3 x 32 


12000 


0.2898(07) 


4.62 




-0.03100 


-0.0060 


24 3 x 48 


3000 


0.2535(05) 


6.07 




-0.03600 


-0.0060 


24 3 x 48 


1800 


0.2250(08) 


5.41 




-0.04100 


-0.0060 


24 3 x 48 


4000 


0.1921(05) 


4.61 




-0.04370 


-0.0060 


24 3 x 48 


3900 


0.1725(04) 


4.13 


-0.04900 


-0.0060 


32 3 x 64 


1100 


0.1212(08) 


3.90 




-0.05294 


-0.0060 


64 3 x 64 


1100 


0.0613(06) 


3.92 




-0.04100 


-0.0120 


24 3 x 64 


1020 


0.1884(08) 


4.52 




-0.04630 


-0.0120 


32 3 x 64 


1065 


0.1445(06) 


4.62 




-0.04900 


-0.0120 


32 3 x 64 


1000 


0.1174(06) 


3.76 




-0.05150 


-0.0120 


48 3 x 64 


1200 


0.0835(07) 


4.01 




-0.02000 


0.0045 


32 3 x 48 


2100 


0.1993(3) 


6.36 




-0.02800 


0.0045 


32 3 x 48 


3910 


0.1488(4) 


4.75 




-0.03000 


0.0045 


32 3 x 48 


2000 


0.1322(4) 


4.24 




-0.03121 


0.0045 


32 3 x 48 


2200 


0.1211(2) 


3.87 


3.61 


-0.03300 


0.0045 


48 3 x 48 


2100 


0.1026(4) 


4.93 




-0.03440 


0.0045 


48 3 x 48 


1100 


0.0864(4) 


4.15 




-0.03650 


-0.0030 


64 3 x 72 


1004 


0.0468(5) 


3.00 




-0.02000 


-0.0042 


32 3 x 48 


1750 


0.1969(4) 


6.30 




-0.03000 


-0.0042 


32 3 x 48 


1450 


0.1297(5) 


4.17 




-0.00500 


0.0500 


32 3 x 64 


1000 


0.2227(04) 


7.13 




-0.01500 


0.0500 


32 3 x 64 


1170 


0.1711(03) 


5.48 




-0.02080 


0.0010 


32 3 x 64 


1150 


0.1251(04) 


4.00 




-0.01500 


0.0000 


32 3 x 64 


1115 


0.1644(04) 


5.26 


3.7 


-0.02080 


0.0000 


32 3 x 64 


1030 


0.1245(06) 


3.98 




-0.02540 


0.0000 


48 3 x 64 


1420 


0.0821(03) 


3.94 




-0.02700 


0.0000 


64 3 x 64 


1045 


0.0603(03) 


3.86 




-0.02080 


-0.0050 


32 3 x 64 


1405 


0.1249(04) 


4.00 




-0.02540 


-0.0050 


48 3 x 64 


1320 


0.0806(03) 


3.87 



... to be continued on next page ... 
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-0.01400 


0.0300 


32 3 


x 64 


1325 


0.1242(5) 


3.97 


-0.01900 


0.0300 


48 3 


x 64 


1045 


0.0830(4) 


3.99 


-0.00900 
-0.01400 


0.0000 


32 3 


x 64 


2280 


0.1523(4) 


4.87 


0.0000 


32 3 


x 64 


1055 


0.1249(5) 


4.00 


-0.01900 


0.0000 


48 3 


x 64 


1080 


0.0836(4) 


4.01 


-0.02100 


0.0000 


64 3 x 144 


1200 


0.0598(2) 


3.83 



Table 1: Overview of our Nf = 2 + 1 simulations. The scales at (5 = 3.31,3.5,3.61,3.7,3.8 are 
a' 1 = 1.697(6), 2.131(13), 2.561(26), 3.026(27), 3.662(35) GeV, respectively. Accordingly, 
the minimum pion mass per coupling is M n = 136(2), 131(2), 120(2), 182(2), 219(2) MeV. 

We put sources for the correlation functions on up to eight timeslices. For the precise form 
of the meson and baryon interpolating operators see e.g. 11231 . To reduce the relative weight 
of excited states in correlation functions Gaussian sources and sinks are used, with a radius of 
about 0.32 fm, which was found to be a good choice . 

6 Algorithm stability 

To detect potential instabilities of the HMC algorithm, different stability tests need to be per- 
formed. A rather complete battery of such tests was described in [9J. The pion masses used in 
this work are considerably smaller than those encountered in [|2l|9]|. For this reason, we repeat 
the relevant stability tests for the smallest-mass ensemble at each (3, in particular for those with 
the physical pion mass. 

With D the Wilson or clover operator, the spectrum of A = D^D has no strict lower bound, 
i.e. the operator A is positive semi-definite, but not positive definite. If one could integrate 
the HMC trajectories exactly, this would not cause any problem, since an eigenvalue A of A 
approaching zero would introduce an unbounded back-driving force in the HMC, so that the 
exact zero would be avoided. In practice, the trajectories are generated with a finite step-size 
integrator. Therefore, a very small A m i n in the MD evolution may experience a smaller back- 
driving force than it would in an exact evolution scheme, and this may trigger an instability. 

If a particularly small eigenvalue appears during the molecular dynamics (MD) evolution, 
the solver in the MD force calculation will require more iterations to arrive at its target precision. 
More precisely, the inverse of the iteration count iV CG is closely related to the smallest eigen- 
value of A. In a given ensemble, Nqq shows an approximately Gaussian distribution As 
long as its median is away from zero by several standard deviations, the simulation is deemed 
safe [|9ll24l. In Fig. [2] we show the "worst case scenario", i.e. the situation for the smallest quark 
masses in our set of ensembles. As one can see, even for pion masses as small as 120—135 MeV 
the inverse iteration count and hence the spectrum is bounded away from zero. 

Alternatively, one can monitor the magnitude of the various contributions to the MD force 
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in the MD evolution. This is done in Fig.[3l where the maximum (over space-time) of each 
individual contribution to the total MD force during the MD integration is shown over a period 
of 10, 000 MD steps. As usual, the maximum of each contribution to the MD force fluctuates. 
However, it is important that the magnitude of these fluctuations is not too large. The more 
frequent spikes with large magnitude occur at any given MD step-size, the lower is the HMC 
acceptance rate, to the point where the algorithm becomes unstable. For small pion masses and 
coarse lattice spacings, the situation becomes even worse. This is why we show the ensemble 
with our smallest pion mass (around the physical pion mass) at the coarsest lattice spacing in 
Fig.[3l As one can see there are no dangerously large fluctuations present. 

Finally, it is good practice to monitor the violation AEmd of the MD energy conservation. 
In Fig. H] we show again the "worst case scenario", that is the ensemble with our smallest pion 
mass at the coarsest lattice spacing. As one can see, the typical AE MD in this simulation is small. 
For most of our simulations the acceptance rate is above 90%. Since the acceptance probability 
is given by p acc = min(l, exp(— AE MD )), it is reasonable to use the data accumulated in the 
monitoring of the MD energy violation to check that (exp(— AE MD )) = 1 within errors. 

In summary, because (a) our algorithm is free of dangerous fluctuations of the clover eigen- 
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Figure 2: Histogram of the inverse iteration count (Nqq of the lightest pseudofermion) in the 
ensemble with the lightest quark mass per (3. There is no danger of a tail stretching out to zero. 
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Figure 3: Evolution of the maximum of each MD force during the MD integration. 256 steps 
correspond to one r = 1 trajectory. Shown is one production stream of our "physical pion mass" 
ensemble at (3 = 3.31 . The other streams with the same parameters give a similar picture. 




250 



Figure 4: Evolution of the energy violation over 250 units of MD time in the same simulation 
as in Fig.\3\ All other simulations show similar or smaller energy violations. 
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Topological charge (3=3.8, m ud =-0.02, m s =0 
1000 2000 3000 4000 5000 




1000 2000 3000 4000 5000 



Figure 5: Topological charge history at our finest lattice spacing (f3 = 3.8 corresponds to a 1 ~ 
3.7 GeV) using two vigorous smearings (10 or 30 HYP steps) in the gluonic charge definition. 

value spectrum, (b) there are no dangerous fluctuations in the MD forces and (c) we therefore 
see that large violations of the MD energy conservation are absent in the simulation (resulting 
in high acceptance rates), we conclude that our setup is safe. 

7 Autocorrelation and ergodicity checks 

A known source of concern about HMC simulations in the regime of light quark masses and/or 
small lattice spacings is whether the Markov chain manages to sample configuration space 
sufficiently well, i.e. whether the algorithm is (in practical terms) ergodic Il25l - l27l . 

We monitor two cheap gluonic quantities which are supposed to signal suspicious behavior, 
if there is any. The first one is the plaquette and/or Symanzik gauge action. With the plaquette 
it makes sense to consider smeared varieties, too, i.e. Re Tr where V is a smeared gauge 
link, as described in Sec. 2. We find integrated autocorrelation times of such quantities to be 
at most O(10) in units of unit-length (r = 1) trajectories. The second quantity is the bare 
field-theoretical (global) topological charge q = a 4 /(167r 2 ) Y,x,n<v Tr[F MI/ (x)F M1/ (a;)] where 
Ffj, v (x) is constructed from links which have undergone 10 or 30 steps of HYP smearing J6J. 
The result for our finest lattice spacing which, according to ||25l - |27l represents the worst-case 
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Figure 6: Scaling of the nucleon and delta mass, at fixed M^/Mp = 0.67, versus aa and a 2 . 



scenario, is shown in Fig. [51 First of all, the 10 HYP and 30 HYP charges are always close to 
each other. Second, binning them with bin boundaries at Z/2 + 1/4 yields a clear abundance 
of integer centered bins over half-integer centered bins, and this effect is more pronounced 
with the more vigorous smearing recipe. Last but not least, the histogram of either charge is 
reasonably symmetric after about 5000 trajectories, and the integrated autocorrelation time is 
O(10) trajectories. Since this is the highest (3 simulated, we see no reason for concerns about 
the (practical) ergodicity of our simulations. 

One should keep in mind that the topological tunneling rate may depend sensitively on the 
details of the action (e.g. whether Wilson, Symanzik or Iwasaki glue is used, whether smeared 
or unsmeared links are used in the fermionic part) and on the algorithm (e.g. the number of time 
scales and the specific choice of Hasenbusch masses). 



8 Nf = 3 scaling test for hadron masses 

Since the link smearing of the 2HEX action used in the present work differs from the smearing 
used in BUHL we decided to repeat the entire scaling test, as presented in [9], in all its detail for 
the new action. 

To this end, we run a number of Nj = 3 simulations at various lattice spacings and various 
M^/Mp ratios. For each (3 we interpolate the (common) octet and decuplet baryon mass, i.e. 
M N /M p and M&/M p , to the point where M^/Mp assumes the value 0.67. The latter value 
coincides with [2(M|- hys ) 2 — (MP hys ) 2 ] 1//2 /M^ hys , hence providing a way to tune to a quark mass 
which roughly corresponds to the physical strange quark mass. The results for M N /M p and 
M^/Mp at this interpolation point are then extrapolated, linearly in aa and a 2 , to the continuum. 
Throughout this report, a = g 2 /(47r) denotes the strong coupling constant. For g 2 we use 
the perturbative values, at our lattice spacings, at 4-loop order (see below). The result of this 
procedure is shown in Fig. [6] Three points are worth emphasizing. First of all, the data are 
consistent with either scaling hypothesis over a large range of lattice spacings (out to a ~ 
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Figure 7: Dedicated finite-volume analysis at (3 = 3.31, with ~ 250 MeV (lower set of data) 
and Mjt ~ 300 MeV (upper set). Results are compared to the prediction from Chiral Perturbation 
Theory. The fit to $42\) is shown by solid red curves and the prediction of ChPT Of is the green 
set of dashed curves. The steep dotted lines indicate the boundaries M^L = 3 and M W L = 4. 

0.15 fm), with a slight preference for 0(a 2 ) over 0(ota) scaling, and this suggests that our tree- 
level value of csw (see Sec. 2 for the definition and details) is close to the nonperturbative value 
(which is not known for our action). This finding is in accordance with the results of [8]. Next, 
the continuum extrapolated values shown in Fig.[6]are in perfect agreement with the continuum 
extrapolated baryon masses found in [9] with a different action. Last but not least, the slope 
in either panel of Fig. [6] is smalfl and an action which shows generically a fiat slope in scaling 
quantities is useful for obtaining precise predictions in the continuum. 

In summary we find that both the 6stout action used in BHIUl and the 2HEX action used in 
the present work exhibit small cut-off effects on standard hadron masses over a broad range of 
lattice spacings. 

9 Finite volume corrections 

For a fixed set of bare parameters, (3, m u d, m s , energies and matrix elements of hadronic states 
depend on the spatial size L of the lattice. Typically, the finite volume tends to increase the 

'The deviation of the result on the coarsest lattice from the continuum is 2.0% at most [A with 0(aa) ansatz]. 
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effective mass and to decrease the decay constant, relative to their infinite volume counterparts. 
As a first step it is important to assess by how much such effects would affect the data. For 
the theoretical treatment of these finite-volume effects it makes a difference whether the state 
considered is stable under strong interactions (despite the fact that, in a finite volume, the energy 
spectrum is discrete and all states are stable). The respective frameworks have been established 
by Liischer, both for stable H28U291 and unstable O0U31H states. They allow us to quantify and 
eventually correct for finite-volume effects in a self-consistent manner (i.e. in a way in which 
only the results of our calculations and axioms of quantum field theory are used). 

The structure of these corrections is most transparent for the case of a pion at 1-loop order 
in Chiral Perturbation Theory (ChPT) ll32] - [34l . Up to higher order terms, the relative shift is 



R n (L) = - 1 = const • Ml ■ ~ 9l (M n L) (42) 

where the shape function gi(x) has a well behaved expansion in terms of a Bessel function of 
the second kind, which itself has a large-x expansion of the form 

K x {x) = Q 1/2 exp(-a;)[l + A + ...] (44 ) 

implying that finite volume corrections are exponentially suppressed at large L [28 J. Higher 
loop orders for R n (L) have been worked out in fl3j . For completeness we mention that analytic 
results for finite volume corrections of the nucleon are given in [|35ll36l . The second reference 
predicts that for physical quark masses and L = 5 fm box size (which is what we use in our 
smallest box at the physical mass point, the one at (3 = 3.61, cf. TabfT]) the nucleon experiences 
just a 2 permil finite-size shift. The point is that ChPT predicts the numerical value of the 
coefficient "coeff" in (l42l) . In the chiral literature, the low-energy constants that enter "coeff" 
are pinned down from experiment (at leading order it is F w ). 

To avoid using external input, we decide to stay content with just using the functional form 
of (|42l) . This is permissible, since the shape function g\{x) is just the free Green's function 
of a massive scalar particle, summed over all spatial mirror copies (due to periodic boundary 
conditions in the spatial directions) lf32"l . see also the discussion in [3J. We find that we can 
establish a global fit to all of our data in various volumes if we adjust the free coefficient in (|42|) . 
A similar conclusion is reached for other stable hadrons, albeit with a different numerical value 
of the constant. For the case of the pion, we test the fitting ansatz and the analytic prediction [|3l 
by comparing them to dedicated finite- volume scaling runs, as shown in Fig.[7J Both the fit 
(full line) and the prediction from ChPT [3] (dashed curve) agree with the data. The latter 
prediction has a limited range, since ChPT becomes questionable in boxes with M n L < 3 J3). It 
is important to emphasize that the data with M W L < 4 in Fig.[7]have been generated to test our 
treatment of finite-volume effects, they do not enter the final analysis. 
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Figure 8: M 2 /m^ AC (left) and F w (right) versus m^ AC (cf. Sec. 1 1) for our 4 lightest ensem- 
bles at (3 = 3.5, at fixed am s = —0.006, which is close to mf ys . A joint fit to the NLO chiral 
ansatz j45[ l46l) yields reasonable values of the low-energy constants. Error bars are statistical. 



These results confirm our rule of thumb that simulations with M W L > 4 and/or L>5 fm 
yield infinite- volume masses within statistical accuracy. An overview of the expected size of 
R Mir in our simulations is given in Fig.Q] In all of these points the mass correction is less than 
about 5 permil, and for points close to M% hys (which dominate our analysis) it is even smaller. 
Nevertheless, we include these (tiny) shifts into our global analysis (cf. Sec. 14). 



10 Chiral behavior of pion mass and decay constant 



To illustrate the quality of our results obtained in lattice QCD calculations with physical or 
larger than physical values of the quark mass m u d = (m u + md) /2, we briefly investigate here 
whether the m u d dependence of the pion mass and decay constant can be described by ChPT 
H37U38H in this range of quark masses. 

To this end we compare our results for M 2 and F w versus m u d at fixed (nearly physical) m s 
(cf. Tab.CD) to the NLO predictions of the SU{2) framework. The latter read Wl 



Ml 



M 



« 1 , , M \ 
1 + 2 Xl ° g( Ap 



l-xlog( A2 



M 2 , 
A1 



(45) 
(46) 



with x = M 2 / (47rF) 2 and M 2 = 2Bm u d a shorthand expression for the light quark mass (up 
to the factor 2B, with B = S/F 2 ). The NNLO expressions can be found in [39]|. In all of 
these expressions F, E, B refer to the pion decay constant, the absolute value of the quark 
condensate and the condensate parameter in the 2-flavor chiral limit m u d — > with m s held 
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fixed. The terms in the square brackets proportional to x , x represent the tree-level and 1-loop 
contributions respectively, and A 3 , A 4 encode low-energy constants (LECs). 

In Fig.[8]the quantities M% / m^ AC and F n are plotted versus the PCAC quark mass m^ AC 
(see below) at an intermediate lattice spacing ((3 = 3.5), where we reach down to M n ~ 130 MeV 
(cf. Tab.CD). We find that our results with M n < 400 MeV can be jointly fitted with the NLO 
chiral formulae (|45l |46|) . with acceptable x 2 /d.o.f. and reasonable values of the low-energy 
constants. However, the extraction of Gasser-Leutwyler coefficients is beyond the scope of this 
paper and will be left for future publications. 



Our primary goal is to determine the average up-down quark mass m u d = (m u + md)/2 and the 
strange quark mass m s at the physical mass point in a "continuum" renormalization scheme, 
such as RI/MOM or RGI, using first-principles lattice computations. 

In a lattice study quark masses and the running coupling have a different status than other 
observables, such as hadron masses and decay constants, since they are input parameters to 
the simulation. Consequently, one has to tune these parameters until the low-energy spectrum 
of the theory agrees with experiment (cf. [|2]| and Sec. 4), before one may read them off from 
the results of the simulation. To turn them into observables, one has to convert them from the 
original cut-off scheme (which is specific to the gluon and quark action combination used) to a 
scheme where the scale /i is not tied to the lattice spacing a. 

The remainder of this section is organized as follows. In 11.1 our "ratio difference method" 
for extracting quark masses in the theory with Wilson fermions is explained, using standard 
terminology for the renormalization and improvement coefficients. It is important to notice that 
in the dynamical theory there is a subtlety in the renormalization pattern, due to quark line dis- 
connected diagrams ||40]442ll . but our "ratio difference method" steers around this complication, 
as explained in 11.2. In subsection 11.3 details of how we determine the flavor non-singlet 
scalar renormalization constant Z s (n) via the Rome-Southampton RI/MOM method ll43l are 
given. In 1 1 .4 it is specified how we control the systematics that arise from the dedicated Nj = 3 
computations needed in the RI/MOM procedure. In 1 1.5 a summary is given. 

11.1 Ratio-difference method in a nutshell 

With Wilson-type fermions there are two options for obtaining the renormalized quark mass. 
On the one hand, one may start from the mass parameter am bare , as present in the Lagrangian, 
and apply both additive and multiplicative renormalization to build the VWI quark mass 



11 RI/MOM renormalization of quark masses 
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(47) 
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and VWI mean^l "vector Ward identity". Here Zs = Zg(fj) denotes the lattice-to-continuum 
renormalization factor of the scalar density (it depends on the chosen scheme and scale, e.g. MS 
and \l — 2 GeV), bs is an improvement coefficient (see below), and m cnt specifies the additive 
mass renormalization, i.e. the bare quark mass at which the pion becomes massless. 

The other way to obtain the renormalized quark mass is as follows. For a pseudoscalar 
meson made out of valence quarks tp 1 , tp 2 with Lagrangian masses m^ are or Wilson masses rnf 
(i — 1, 2), respectively, the sum of the (unrenormalized) PCAC quark masses is defined as 

™PCAC , pcac _ Ef(^[^W + ac A d tl P(x)]0(Q)) 

1 + 2 " E*(P(x)O(0)) m 

where and P denote the axial current and the pseudoscalar density, respectively, O represents 
an arbitrary operator which couples to the meson (usually O = P to maximize the signal), and 
<9 M = (d fl + d*)/2 is the symmetric derivative with (d fl (j))(x) = (4>(x + afi) — <p(x)) / (2a) . Then 
only a multiplicative renormalization is needed to form the (renormalized) "AWI quark mass" 

awi _ 1 + b A am w + Q(a 2 ) PCAC 

~ Z P l + b P am™ + 0(ai) 1 j 

where AWI stands for "axial Ward identity". Here Z A and Z P = Z P (fi) denote the lattice-to- 
continuum renormalization factor of the axial current and the pseudoscalar density, respectively. 

The coefficients bs, b A , bp, c A in (l47l-l49l) are part of the improvement program. If properly 
set, 0(a 2 ) scaling of phenomenological quantities can be achieved, but they may be set to zero 
if one is content with 0(a) scaling. We use (l47l-|49l) with tree-level values of the improvement 
coefficients, that isb s = b A = b P = l and c A = . Formally, our results thus show cut-off effects 
proportional to aa, but in the scaling tests presented above cut-off effects proportional to a 2 
seem to be numerically dominant. At this point we cannot anticipate whether a similar statement 
holds true for renormalized quark masses, and we shall thus consider both possibilities (i.e. 
leading cut-off effects proportional to aa or a 2 ). In any case the difference (in a given scheme, 
at a given /i) scales away with a — > 0, hence m AWI = m VWI in the continuum. 

In principle, mf^ and m^ ys may be determined using either definition of the quark mass, 
but in practice it proves beneficial to combine the specific advantages of the VWI and AWI 
approaches. Let us assume, for a moment, that we were to set all improvement coefficients to 
zero. Since the Lagrangian quark mass m bare is an exact input quantity which, after a universal 
shift has been applied, multiplicatively renormalizes with the unproblematic scalar density [cf. 
(|47T)1. it is natural to use m w for quark mass differences, where the additive renormalization 
term m cnt drops out. On the other hand, the PCAC quark mass m PCAC is perfectly suited to 
measure quark mass ratios, since in the ratio the multiplicative renormalization constants cancel 
[cf. (|49l)l. It is thus natural to measure the difference m s — m u d via the Wilson or Lagrangian 



2 Strictly speaking the vector Ward identity constrains only quark mass differences. Below we will use m VWI 
only in such differences, and by doing so the dependence on to" 1 * will persist only in an 0(a)-suppressed term. 
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mass difference d = arnf — am^ d = amj™ — amj™ and the ratio m s j m u d via the PCAC mass 
ratio r = CAC /m^ Ac . In this case one obtains the renormalized masses through 



Id 1 rd 

scheme _ ± /777? sdleme — CSfV) 

a " L ud — ^scheme ^ ' a " L s ~ ^scheme J 



and we shall refer to this strategy as the "ratio-difference method". The renormalization scheme 
will be specified below. In practice, things are slightly more involved, as we intend to maintain 
tree-level improvement. Setting ca = and 6 a = bp = 1 does not change anything in (l48l 
49b , but having a quadratic term in (|47l) through 65 = 1 means that the difference rnf — m^ 
does no longer coincide with mj^-m^™. In the next subsection we will show that even with 
improvement, the renormalized quark masses are given by ( |50l ) with d — » <i imp , r — >■ r imp , where 
the latter quantities are defined in 



11.2 Ratio-difference method in full QCD with improvement 

In the dynamical theory, the renormalization pattern of quark masses is slightly more involved 
than the familiar equations (|47l |49| ) suggest 114211 . but it turns out that our "ratio difference 
method" gets rid of these complications and the final relation is unchanged. 

We now discuss how the findings of H2l apply to our method, using their notation, ex- 
cept that we do not use a "hat" to denote renormalized quantities, since they will come with a 
superscript "VWI" or "AWI", just as in the previous subsection. Equations (26, 48) of fl42] read 
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v _ vv 
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AWI = AA PCAC 



-b s amf - b s aTr{M) + 0{a 2 )] + ... (51) 
2 J 



1 + (b A -bp)amf + (b A -b P )aTr(M) + O(a') (52) 



3 



where Zj is the flavor non-singlet renormalization constant ( J = S, A, P) , and bj = 1 + (a), 
bj = 0(a 2 ), ca = 0(a) denote improvement coefficients (which now depend on Nf). Finally, 
mY = m^ are — m cnt , with m cnt defined as the Nf = 3 critical mass (i.e. in the unitary direction), 
m PCAc j US { as m d4gj^ an( j e iiip Ses m (|5TJ) denote terms which depend on the quark masses 
only through Tr(M), Tr(M 2 ), Tr 2 (M), where M is the (flavor diagonal) quark mass matrix. 
The new feature of formulas (|5Tl l52l) is the terms proportional to rrij times Tr(M) = J2f m Y- 
These terms make the renormalized quark mass of flavor j depend on all other quark masses, 
too. Evidently, these terms come from quark loops in the functional determinant, and the per- 
turbative expansion of the new improvement coefficients bs, 6a > bp shows that they start out at 
order <?q, which means that they come through two-loop effects (one quark loop and a gluon 
loop which attaches it to the quark line whose renormalization is studied). 

Upon considering the difference of two VWI masses and the ratio of two AWI masses 



VWI VWI 1 /_W W\ 

m] -ml = -^{rrij - m k ) 
Zs 



-b s a(mf+mf) - b s aTi(M) + 0(a 2 )] (53) 
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1 + (b A -bp)a(my-m™) + 0(a 2 ) 



(54) 



the term proportional to aTr(M) disappears from the second relation, and the term proportional 
to bs involves only the sum of the Wilson masses. Applying these formulas to m s and m ud in 
Nf = 2 + 1 QCD, with d = am h s are — am^f e and r = m^ CAC /m^ AC defined as before, one has 
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where we have used d and r only in the leading term, so far. The point is that 
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(55) 
(56) 

(57) 
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ud 



where the approximately equal sign means "up to terms of order 0(a 2 )". Accordingly, we can 
express the difference of the VWI masses and the ratio of the AWI masses through d and r as 



VWI VWI 

am s - am ud 



— d imp 



m 



AWI 



.imp 



rn 



AWI 

ud 



where d imp and r imp are defined as 

d imp = d 



1 - -b s d - b s d + 0(a 2 ) 

2 r — 1 r — 1 



.imp 



l + (b A -b P )d + 0{a 2 ) 



(59) 

(60) 
(61) 



In total this means that one finds am^ 1 ™ 6 and amf heme via (l50l) with d — > <f mp , r — )■ r imp . 

In our analysis, the tree-level improvement strategy makes all subleading terms in the square 
brackets of (|60~1I6"TT) disappear, except for the one proportional to b s (with b s = 1). 



11.3 Determination of the scalar RI/MOM renormalization factor 

Having laid out our overall strategy for obtaining the renormalized quark masses m^ ys (/i), 
m phys (fx) at the physical mass point in a standard scheme at a given scale /i, we now give 
details of how we compute the single renormalization factor needed, Zs(^). We implement 
the nonperturbative Rome-Southampton method which defines the regularization-independent 
(RI/MOM) scheme [|43l . with several practical refinements (see below). In the terminology 
of [|40] - l42l the result is the non-singlet renormalization factor Z^ s (/i). In the RI/MOM scheme 
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Figure 9: The ratio ((t) as defined in (f66l) . The gauge coupling in this Nf = 3 run is f3 = 3.61, 
die quark mass is am = —0.0045. This procedure yields a stable plateau for Z v . 



the running of Zs{\i) is known perturbatively to 4-loop order p4|. However, this is only relevant 
for the conversion to other schemes, e.g. MS at \i = 2 GeV. Our main result, m ud and m s in the 
RI/MOM scheme at \i = 4 GeV is derived without reference to perturbation theory. 

In the RI/MOM scheme, renormalization conditions are defined in Landau gauge and re- 
quire the forward, truncated quark Green's function of an operator to be equal to its tree-level 
value at a Euclidean four-momentum p, whose magnitude is chosen to be the renormalization 
scale. Given a quark bilinear operator 0\ 2 — ^Tipi, where ipi and ip2 are mass -degenerate 
quark fields and T is a Dirac matrix, the relevant Green's function is 



A r (p) = (Sip))- 1 1 f d 4 xd*ye^ (^(y)C^(0)^(x)> \ ^(p))" 1 . 



(62) 



In this equation, S(p) is the propagator of quark flavors 1 and 2. Now, defining a projector P r 
such that tr{Prr} = 1 (the trace is over spin x color), the renormalization condition reads 



Z T (fi) = Zjbi) I T r (p)\ p 2-_ 
where Z^ is the wavefunction renormalization factor and 



(63) 



Trip) = tr{P r Ar(p)} 



(64) 
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Figure 10: Renormalization factors Z^(/i 2 ) as a function of the bosonic momentum squared. 
For each (3 momenta \i < ir/ (2a) are included. The data from the coarser lattices have been 
multiplied by a fi-independent factor to match those at (3 = 3.8. The solid line represents a Fade 
ansatz where the 1-loop anomalous dimension is built in as a constraint. 

To determine Z s from the RI/MOM condition (1631) with T = I, one needs to know Z^. In the 



original publication II43II the procedure was supplemented with a recipe to obtain from the 
momentum dependence of the trace of the inverse propagator. Here we avoid the determination 
of this wave-function renormalization constant all together, by calculating the ratio Z s (fi)/Z v 
via the RI/MOM procedure and by combining it with Zy from the 3 -point function with a 
vector-current insertion. In other words, on each ensemble we compute Zs(/-i)/Zy using 



Zs,0,m((J>) IV (p) 



r s (p) 



(65) 



Zv,P,m 

where the dependence on the coupling and the Nf = 3 quark mass is indicated with subscripts. 
The bosonic momentum definition p v = (2/a) sin(ap I/ /2) is used, and a standard cylinder cut 
around hyperdiagonal momenta is applied ||45ll . In addition, we determine Zy from the ratio 



C(t) 



(p(r/2)v 4 (t)P(Q)> 

(P(T/2)P(0)) 



(66) 



where 



P(t) = £$a75^i)(£,f) , p( t ) = > v A (t) = 'Efa-rM&t) (67) 
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32 3 x64 


-0.04 


4780 


-0.006 


2560 


-0.0045 


4620 


-0.0060 


1010 


0.000 


505 


-0.06 


3320 


-0.010 


3140 


-0.0085 


3680 


-0.0085 


1050 


-0.004 


635 


-0.07 


2420 


-0.012 


2580 


-0.0100 


4140 


-0.0110 


1020 


-0.008 


500 


-0.08 


2500 


-0.020 


2700 


-0.0200 


3140 


-0.0140 


1290 


-0.012 


1030 






-0.035 


1090 


-0.0250 


1230 


-0.0160 


1020 


-0.014 


1000 



Table 2: Bare masses and number of trajectories of our dedicated Nf = 3 simulations for RU 
MOM renormalization. The (3-values are the same as in our phenomenological runs, cf. Tab.0 



and T denotes the temporal extent of the lattice. With tree-level improvement one has 11421 |46l 

Z v ^ m (l + am w ) = [C(t 1 )-C(t 2 )]- 1 for <h < T/2 <t 2 <T (68) 

where by — 1, by — have been used, and Fig.|9]shows the plateau from which we extract Z v ^^ m . 
Combining this factor with the result of (|65l) yields Zs,/3, m (fi), much in the spirit of H47U48I . 

11.4 Controlling the systematics in the RI/MOM procedure 

RI/MOM is a mass independent scheme. Applied to the numerical data for Zf l (fx) this means 
that we have to extrapolate all three flavors to vanishing sea and valence quark mass. For this 
reason, we have generated a series of dedicated Nf = 3 lattices (i.e. with three degenerate 
quarks), where the action S = S^ ym + Sj W and the couplings (3 = 6/g 2 are the same as in the 
phenomenological ensembles. The bare parameters and statistics of these runs are summarized 
in Tab.[2] The specifics of the extrapolation will be discussed below. 

In order to obtain tree-level 0(a)-improved results with Wilson fermions, one has to im- 
prove not only the action, but also the interpolating fields. For standard correlators this has been 
discussed in the previous two subsections. In addition, in the RI/MOM procedure, one has to 
remove an O(a) contact term in the quark propagator [43]. We apply here the trace subtraction 
described in [149T1521 . which has the added benefit of greatly improving the signal to noise ratio. 
This subtraction is implemented by replacing the condition (|63l) by one in which the modified 
propagator S(p) = S(p) — Ti(S(p))/A is used to define the amputated Green's function, where 
the trace is in spinor space. 

In order to reliably extract the renormalization constants and to convert the resulting quark 
masses m RI (/i) to other schemes without loosing precision, several conditions should be met: 

(a) the scale fi at which we take the continuum limit of the RI/MOM renormalized masses 
needs to be substantially below the momentum cutoff of the coarsest lattice n<^2%/a, 

(b) the conversion to a perturbative scheme has to be done at a scale // which is sufficiently 
large, such that perturbation theory is reliable, i.e. at fi'^$> Aqcd, 

(c) the effect of the chiral extrapolation m — > needs to be fully controlled. 
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Figure 1 1 : Ratio of the nonperturbative Zf l to the perturbative prediction at 4-loop level. The 
momentum range shown extends to fi m . 
with the plateau within errors. 



7r/(2a) at /3 = 3.8. For fx > 4 GeV the data agree 



The difficulty to fulfill, in one simulation, the first two conditions is sometimes referred to 
as the "window problem" of the RI/MOM procedure. In the following we show how we can 
simultaneously satisfy all three requirements. 

Ad (a): To renormalize our quark masses and to extrapolate them to the continuum we 
choose a convenient renormalization scale /i = 2.1 GeV. This scale satisfies \i < n /(2a) for all 
our lattices (on the coarsest one this figure is about 2. 7 GeV). When plotting the running of 
Zs,/3(fi) at different (3 on top of each other (see Fig. [10]), one finds that discretization effects are 
below our statistical accuracy in this region, and that the form of the running is almost identical 
for our five (3 values. 

Ad (6): With the procedure described above and by taking the continuum limit we obtain a 
fully nonperturbative determination of the quark masses in the RI/MOM scheme at ii = 2. 1 GeV. 
In principle, we could stop here, quoting this as our main result. However, if one wants to 
convert this result to another scale or another scheme, it is evident from Fig. [12] that doing 
so perturbatively would introduce an uncertainty in the 1 — 2% range. Therefore we use our 
renormalization data to run our quark mass results, nonperturbatively, to the scale a! = 4 GeV, 
where this perturbative uncertainty is in the 0.5% range and hence subdominant. At \j! = 4 GeV 
we still have 3 different (3 values which satisfy the condition is! < n /(2a). More specifically, 
we use our data to extrapolate the ratio Zl^^u) / Zf l p(/j,') to the continuum, with an extremely 
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Figure 12: Ratio of the perturbatively evaluated Zf l (fj,) (top panel) and Z^ s (jj,) (bottom panel) 
at different loop orders. The renormalization group equations have been numerically integrated, 
using l-loop through 4-loop anomalous dimensions. To estimate the remaining uncertainty in 
the 4-loop running, we employ the analytic expression at 4-loop level H44\l . which differs from 
the numerically integrated one by 5-loop effects. In the labels this is called "4-loop/ana". 
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mild effect (as one can see from Fig.[lOl in this interval the three curves lie essentially on top 
of each other), and the resulting ratio provides us with the nonperturbative running of the scalar 
renormalization constant, in the continuum, between fi = 2.1 GeV and // = 4 GeV. Accordingly 

Rf(», 4 GeV = lim s f R (69) 

is the continuum extrapolated ratio which allows us to evolve data from all our lattices, in- 
cluding the coarser ones, to fi' = 4 GeV, where we perform the final continuum extrapolation. 
Through this procedure we obtain fully nonperturbatively renormalized quark masses in the 
RI/MOM scheme at fi' = 4 GeV, which represent our main result. For the reader's convenience 
we also convert them to other schemes. To this end we use 4-loop perturbative running to 
convert to the RGI framework (where we use the conventions of ll53l with bo, do adjusted to 
Nf = 3), and subsequently to the MS scheme (which is perturbatively defined). 

Ad (c): As mentioned above, the RI/MOM scheme is a massless renormalization scheme. 
Since the dedicated Nf = 3 simulations as listed in Tab.[2]use finite quark masses (roughly in 
the range mP hys /3 < m < mf ys ), we have to perform a chiral extrapolation at some point. 
In the procedure described in the previous paragraph, the numerical data for Z^\ m {jji) were 
first extrapolated to the chiral limit to give Based on this the renormalized quark 

masses mW'Qj,) and the ratios Zs t /3(fi')/Zs^(fi) were extrapolated to the continuum, as detailed 
in (a) and (b), respectively. To give a reliable estimate of the systematic uncertainties involved, 
we supplement this procedure with a second one where we interchange the order of limits. 
Technically, this means that one defines an intermediate MOM scheme, which is not a massless 
one, but instead based on a fixed reference quark mass. We use mf^ 1 = 70MeV, since, for 
all (3, this value can be reached by interpolation. In this scheme the renormalized light and 
strange quark masses are determined at the scales /i G {1.3, 2.1} GeV, and extrapolated to the 
continuum. This yields rn^p^ (/V) and ditto for m s . Staying in this massive scheme, these 
quark masses are evolved to the scale fi' = 4 GeV. In this step a fully controlled continuum 
extrapolation can be performed, since we have three lattice spacings satisfying // < n /(2a). At 
this point we have the renormalized quark mass in the form 



ni 



M,,M W^M-Ht (70) 



ud,m I0 f\r 1 ' J ud,m Ic f 



where either factor has been extrapolated to the continuum. In the last step, we switch from the 
intermediate massive MOM scheme to the massless RI/MOM scheme by multiplying (1701 ) with 
the continuum extrapolated ratio Z Sy m vai (fx') /Z s (^'). This yields the same m^(//), m^//) as 
before, except that the order of limits has been interchanged. Note that all continuum extrap- 
olations are entirely flat and the effect of the mass extrapolation is about 1%, implying that all 
limiting procedures are fully controlled. Having obtained our main result, the RI/MOM masses 
at // = 4 GeV, we can transform them to other schemes as described under (b). 
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P L 3 xT 


(m s +m ud )r 


5.7366 12 3 x24 
5.8726 16 3 x32 
5.9956 20 3 x40 
6.1068 24 3 x48 
6.3000 32 3 x 64 


0.3070(50) 
0.2801(50) 
0.2758(52) 
0.2654(42) 
0.2685(29) 



Table 3: Details of the quenched overall test. The quark masses are in the MS scheme at 2 GeV. 
11.5 Summary of RI/MOM renormalization 

Let us summarize this section. We compute the quark masses m^ ys and mf 5 " 5 through the 
"ratio-difference method" in the RI/MOM scheme at the scale // = 4 GeV, nonperturbatively 
and with extrapolation to the continuum. 

The mild quark mass dependence of the renormalization factors is eliminated through a 
chiral extrapolation. Also cut-off effects are removed through a continuum extrapolation. In 
this step we are extremely conservative - we do not only consider the formally leading cut-off 
effects 0(ota), but also subleading effects proportional to 0(a 2 ), counting the spread towards 
the final systematic error (see Sec. 14). We think this is necessary, since even with a set of 
5 lattice spacings, we cannot exclude the possibility that the subleading 0(a 2 ) cut-off effects 
largely affect the continuum extrapolation. If we were to consider only the leading O(aa) 
cut-off effects, our systematic error would be significantly smaller. 

The quark masses in the RI/MOM scheme at the scale /jf = 4 GeV are our main result, 
obtained in a way which guarantees that they are truly nonperturbative. Using perturbation 
theory in a regime where it is well behaved, we convert them to the universal RGI prescription 
and subsequently to the perturbatively defined MS scheme at the scale jx — 2 GeV. 



12 Quenched overall check 

To demonstrate that our 2- step HEX smeared clover action (OQ) and the nonperturbative renor- 
malization of the quark mass yield reliable results in the continuum, we repeat the quenched 
benchmark calculation Il53l of the quantity m s +m ud , using our setup. 

We use pure Wilson glue at five couplings between (3 = 5.7366 and (3 = 6.3, each time saving 
about 600 well-decorrelated configurations for the analysis (i.e. 200 for the Z-factors and 400 
for the masses). The couplings and geometries have been chosen to realize a fixed physical box 
size of about L = 1.84 fm, see Tab.[3]for details. On each set at least 4 quark masses are used 
to safely interpolate to the point M P r = 1.229, where M P is the pseudoscalar meson mass and 
where the numerical value has been chosen to match M^ hys r with r = 0.49 fm 11541 . 

The computation closely follows the dynamical case. We renormalize the VWI quark mass 
sum with the methods described in the previous section, and we use the same procedure to 
convert to the MS scheme. In more detail, we begin with measuring rnF CAC as a function of 
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Figure 13: Quenched continuum extrapolation of (m s +m u d)ro in the MS scheme at fj, = 2 GeV, 
assuming 0(aa) [top] or 0(a 2 ) [bottom] scaling. One data-point outside the scaling regime 
((3 = 5.7366) is shown. The difference counts towards the systematic error (see text for details). 
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the bare mass m c , which shows a linear relationship. The intercept with the x-axis yields 
m crit an( j t ^ us m w as defined in (|47T) . Next, we determine Z s {n) via RI/MOM |43l to obtain 
m VWI according to (|47T) . It is easy to see that this is the flavor non-singlet Zs((jl), since all 
quark disconnected contributions vanish in the quenched theory. In close analogy with our 
phenomenological analysis, we choose the matching scales fi = 2.1 GeV and // = 3.5 GeV. 
Combining the continuum extrapolated ratio Zs(fJ>')/Zs(n) with Zspip), we obtain Zs t p(n') 
and the renormalized mass m VWI (//) in the RI/MOM scheme at the scale // = 3.5 GeV. Finally, 
we use perturbation theory to convert to the MS scheme at 2 GeV scale. The result is identified 
with m s + m ud in this scheme, at the given lattice spacing, and multiplied with r to obtain a 
dimensionless quantity (cf. Tab. [3]). We find that we can extrapolate these values linearly in era 
or a 2 , with the data showing a slight preference for the latter option, as can be seen from the 
two panels in Fig. [[3] Using the machinery for propagating both statistical and systematic errors 
that will be described in Sec. 14, the combined result in the continuum reads (m s + m u d)ro = 
0.2609(39) (28) in the MS scheme at fi = 2 GeV. 

Our result is in perfect agreement with the continuum value (m s + m ud )r = 0.261(9) 
quoted by the ALPHA collaboration [53]. It is consistent, within less than la, with the result 
0.274(18) given by JLQCD [EH and, within less than 2a, with the value 0.312(28) obtained 
in a computation with quenched overlap fermions that includes a continuum extrapolation [|56l . 
There is some tension with the result 0.293(6) by CP-PACS [57J, but one should keep in mind 
that the systematic uncertainty due to the perturbative renormalization is not included in their 
error. In short, we find good agreement with the most precise results in the literature. We take 
this as evidence that the renormalization procedure described in Sec. 1 1 yields reliable results. 

13 Using dispersive input to obtain m u and irid 

For decades the most reliable source of information on light quark masses has been current al- 
gebra, in particular in its modern form, known as Chiral Perturbation Theory (ChPT). A major 
drawback of this framework is that only information on quark mass ratios can be extracted, not 
on absolute values. This is a consequence of the fact that in the chiral Lagrangian all quark 
masses appear in the combination B m g and the condensate parameter B does not occur in 
any other instance. We have determined m u d = (m u +md)/2 and m s in MeV units. Accord- 
ingly, by comparing our value of the ratio m s /m u d to theirs, we can learn something about the 
convergence pattern of SU(3) ChPT. Furthermore, one may combine our values for m u d and 
m s with the best available information on another ratio (Q, see below) to obtain a result for the 
individual quark masses m u , nid. 
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13.1 Comparing our value of m s / m u d to the one in ChPT 

As a starting point one might ignore higher-order terms in the chiral expansion and electromag- 
netic corrections all together. Upon identifying the left-hand sides in 

Ml = B (m u + m d ) (71) 

M 2 K± = B {m u + m s ) (72) 

M| = B (m d + m s ) (73) 

M 2 = B (m u + m d + 4m s )/3 (74) 

with the experimentally measured meson massell one obtains three predictions. On the one 
hand, the Gell-Mann-Okubo relation 

3M 2 + Ml ~ 2M| r± + 2M|o (75) 
evaluates to 0.919 GeV 2 ~ 0.983 GeV 2 , which amounts to a 7% accuracy. On the other hand 

{M 2 K± +M 2 K0 )/Ml = (m s + m ud )/(m ud ) (76) 
Ml /Ml = (2m s + m ud )/(3m ud ) (77) 

yield m s /m ud ~ 25.1 and m s /m ud ~ 23.4, respectively. This spread suggests again a precision 
of a few percent. Upon noticing that the 77 undergoes significant mixing with the rf and, as a 
result, that (1761) should be preferred over (1771) , one arrives at the estimates 

m u M 2 K± - M 2 K « + Ml 



m d M 2 R0 -M 2 K± + Ml 
m s _ M 2 K± + M 2 K0 - Ml 



0.66 (78) 
20.8 (79) 



m d M| - M 2 K± + Ml 

which do not take into account electromagnetic contributions to isospin breaking. 

The chiral framework may be extended to include interactions with photons. At leading 
order in a cm and in the 3-flavor chiral limit the electromagnetic contribution to the excess of the 
charged kaon mass squared is the same as for the pion, i.e. [M 2 ± — M 2 ]nm = [Mj c ± — M^ } em , 
known as "Dashen's theorem" ll58l . This leads to the improved relationsj ll59l 

m u Ml ± - Ml + 2M 2 - M 2 ± 

— = , .9 , Z> 4t9 — ^^0.56 (80 

m d M 2 K0 - M 2 K± + M 2 n± 

m s M 2 ± + ML - M 2 ± 

— = - £ , £ ^ 20.2 (81) 

m d M 2 R0 - M 2 K± + M 2 ± 

which account for electromagnetism at leading order (LO) in the chiral expansion. From this 
one obtains m s /m ud = 2/{m d /m s + m u /m d - m d /m s ) ~ 25.9 as the LO result in ChPT. 
Comparing this to our value (f85l) [see below] indicates that - for this quantity - subleading 
contributions yield only about 6% of the total result. 



3 Pseudoscalars without superscript refer to isospin averages: M% = |(M^ ± +M^ ), M\ — ^(M^ ± +M^- ). 
4 The numerical values are based on the latest edition of the PDG ll20ll and differ from those given in ll59ll . 
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13.2 Using dispersive information on Q to split m u d into m u and rrid 



As mentioned in the previous subsection, ChPT is well suited to address the ratios m s /md 
and m u /m d . A way to encode such information on quark mass ratios which, from the ChPT 
viewpoint, is particularly robust is to introduce the double ratio 

Q 2 = m \- m \ d (82) 

since this quantity is unaffected by next-to-leading order (NLO) effects in the chiral expansion. 
Modulo a tiny correction, (f8~2~l) can be put into a form known as "Leutwyler's ellipse" ll59l 



and relying on Dashen's theorem ll58l or refinements thereof (see e.g. [1211 ). one might attempt 
to determine the value of Q from the masses of the charged and neutral kaon and pion. 

Since we intend to use ([821 to predict the isospin splittings in QCD (i.e. without electro- 
magnetism), it seems more advisable to build on the long tradition in the phenomenological 
literature to determine Q from the rate for 77 — ?- 37r decays or from the branching ratio of 
ip' — > ipn° versus if)' — > ipr/ decays. The former amplitude seems particularly interesting, as 
it violates isospin, while being barely affected by electromagnetic corrections ll60l . Evidently, 
this renders it sensitive to the effect of rrid— m u 7^0, which is exactly what we are interested in. 
In the following, we restrict ourselves to the dispersive treatment of the 77—7-3^ amplitude, as 
given by Kambor-Wiesendanger-Wyler 11611 . Anisovich-Leutwyler 11621 . and Colangelo-Lanz- 
Passemar ll63l . In the first place we note that the central value found in these works has been 
remarkably consistent over one and a half decades. Let us also emphasize that a dispersive treat- 
ment is, conceptually, as much from first principles as a lattice computation - dispersion theory 
rests exclusively on the axioms of quantum field theory. In a world with perfect experimental 
data, this would be the complete story. However, with presently available data, additional input 
is required (see e.g. [|63l ). To account for such provisional effects, Leutwyler has assigned his 
estimate Q = 22.3(8) [[641 a much larger error bar than claimed in some of the publications it is 
based on. In our view this is the most accurate value available, if one is not willing to resort to 
model calculations, and we shall thus stay content with its rather conservative error bar. 

We now extend our lattice determinations of m u d and m s /m u d to all three quark masses, 
using this dispersive information. Upon rewriting (I8~2ll8~3l in the form 




(83) 




(84) 



it follows that the above-mentioned value of Q and our lattice result 



27.53(20)(08) 



(85) 



m ud 
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yield the light quark mass asymmetry parameter 



nid — m, 



■u 



0.381(05)(27) 



(86) 



nid + m. 



u 



where the error on Q is considered a systematic error. As an aside we mention that this asym- 
metry parameter is equivalent to m u /m d = 0.448(06)(29). Combining <[8~6l ) with our result 

m ud = 3.503(48) (49) MeV, we obtain 



with all masses given in the RI/MOM scheme at the scale [i = 4 GeV. These values and our 
original results for m s and m u d (along with their counterparts in the RGI and MS schemes) are 
summarized in Tab.[5](see Sec. 15) and quoted in JI). 

To summarize the technical part, let us say that we have determined m u and nid, based on 
our lattice value of m u d, our lattice value of the ratio m s /m u d and the dispersive treatment of 
Q. Given that our simulation points bridge the physical values of m u d and m s (cf. Sec. 5), the 
chiral framework is no longer needed in the first two quantities, and the use of ChPT is thus 
limited to a subdominant contribution in a mostly dispersive framework to determine Q. 

13.3 Physics implication, robustness issues and precision outlook 

Physicswise, an important conclusion is that our result (l86l) for the light quark mass asymmetry 
parameter excludes a vanishing up-quark mass by 22.1 standard deviations. This is a conse- 
quence of the dispersive determination of Q being entirely inconsistent with 13.8, the value of 
Q which relation (|84l) and our result for m s jm ud would enforce if m u = 0. As can be seen from 
(|84|) . the asymmetry parameter depends strongly on the ratio m s /m u d, which is the quantity that 
we have determined to sub-percent precision. The bottom line is that our precise lattice results 
and the dispersive processing of phenomenological information which excludes very large cor- 
rections to Dashen's theorem, when combined, rule out the simplest proposed solution to the 
so-called "strong CP problem". This corroborates previous findings ll59l . 

Note that the way in which we have used phenomenological information is designed to 
make sure that the so-called "Kaplan-Manohar ambiguity" is circumvented in our derivation of 
m u and nid. This ambiguity expresses the fact that a redefinition of the quark condensate and 
of certain low-energy constants allows one to move on Leutwyler's ellipse [|65l . It represents 
an accidental symmetry of those Green's functions in the effective theory which determine 
pseudoscalar masses, scattering amplitudes and matrix elements of the vector and axial-vector 
currents 11591 . However, the aspect ratio of Leutwyler's ellipse is not affected by this ambiguity, 
and it is this shape informational which is encoded in Q. In consequence, relation (|84l ) ensures 

5 We remark that Q as defined in d82l i picks up, under a Kaplan-Manohar transformation, terms of order NNLO 
and a change proportional to m,j—m u . The latter "deficiency" could be cured by defining Q\ = (m 2 s -^m d ) / (m^-m^) 
||66l . Note, however, that the numerical difference between Q and Qi [or Q2, the quantity that shows up in < T84b 1 
is about one permil, i.e. more than an order of magnitude smaller than the uncertainty that we have assigned to Q. 



2.17(04)(10) MeV, m d = 4.84(07)(12) MeV 



(87) 
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that the high precision that we have reached in m s /m ud , together with the robust value of Q that 
we use, leads to a determination of the asymmetry parameter (f86l) and thus of the individual u 
and d quark masses which is unaffected by the Kaplan-Manohar ambiguity. 

We stress that, in our view, there is not much conceptual difference between using only 
M-k.Mk as input quantities versus including Q, too. To compute m u d,m s , we needed two 
(polished) experimental input numbers to adjust the average light and the strange quark masses, 
apart from Mq to set the overall scale (cf. Sec. 4). To compute m u ,m d ,m s , evidently, we 
need a third one, and we are well advised to choose one which is sensitive to the effect we 
want to quantify. We select Q for its large sensitivity to QCD-induced isospin breaking, thus 
requiring very little theoretical polishing, and for this little bit resting on dispersion theory 
which is well founded. Still, there is room for improvement, as can be seen from the fact 
that our value of m ud had 2% precision, while m u and m d have only 5% and 3% accuracy, 
respectively. The problem is that the current value of Q determines the asymmetry parameter 
(|86l ) to only about 7% precision. While improvements on the value of Q obtained in this way 
may be possible If63l , reaching accuracies of m u ,m d below the few percent level will most 
probably require a different approach, even more heavily based on lattice field theory. Indeed, 
once simulations become available with Nf = 1 + 1 + 1 physical quark flavors (i.e. with non- 
degenerate up, down, and strange quark masses, each of which is taken at its physical value) 
and with an additional abelian gauge fielcEl to account for electromagnetic interactions, it will 
become possible to take full advantage of the very accurately known K + and K° masses to 
determine m u and m d with even higher precision. 

14 Assessment of systematic errors 

Our approach is to establish one global fit to interpolate our 11 + 12 + 9 + 9 + 6 = 47 simulations 
at 5 different lattice spacings (cf. Tab.Q} to the physical mass point (i.e. physical M w and M K ) 
and to extrapolate to zero lattice spacing (i.e. a — v 0). In order to obtain a reliable estimate of the 
systematic error involved, we repeat the entire analysis with a large selection of interpolation 
formulae, mass cuts, discretization terms, fit ranges, and renormalization procedures. 

In order to extrapolate or interpolate a given quantity to the physical quark mass point, one 
needs to expand it around some pion and kaon mass point. Often the Nf = 2 or Nf = 3 chiral 
limit is chosen as an expansion point and hence SU(2) or SU(3) ChPT Il3~7ll38l as the theore- 
tical framework. Expressing the dependence on the light quark mass as a dependence on M n , 
this kind of expansion leads, for a quantity which vanishes in the chiral limit, to a quadratic 
term oc M% and higher order chiral logs, e.g. oc log(M^/A 2 ), with known prefactors but 
unknown scale A. In many cases the practical usefulness of knowing the prefactors is limited, 
since they contain other quantities (e.g. the axial coupling for octet baryons) which may 
not be available from the same simulation and which one may not want to borrow from phe- 
nomenology. Furthermore, it is rather difficult for a fit to tell, e.g., a pure M% contribution from 

6 For recent progress in this field see e.g. J67J. 
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an M^log(M^/A 2 ) chiral log. Accordingly, choosing an expansion point for an interpolation 
somewhere in the middle of the region where one has data (or in the middle of the region defined 
by the data points and the target point in case of an extrapolation) and using a simple Taylor 
expansion in Ml leads to rather similar results 0. 

To flesh out the meaning of these statements, let us consider the quantities of interest, m u d 
and m s . In our analysis we use the NLO mass formulae (|45T) from SU(2) ChPT [37], albeit in 
reversed form, so that it expresses m ud as a function of M w . To the order we are working at, this 
can be done in several ways [the difference is an NNLO effect]; we use the relations 

where we have introduced a hadronic quantity 

A = 2M| -Ml - [2M 2 K - Ml} phys (90) 

to parametrize the small deviation of our strange quark mass from its physical value 0. Alter- 
natively, for the light quark mass we use a Taylor expansion of the form 

m ud = ci + c 2 Ml + c 3 M* + c 4 A (9 1 ) 

while the strange quark mass is always parametrized as 

m s = c 5 + c 6 M 2 + c 7 A + c 8 A 2 . (92) 

We have tried to augment these formulas by higher order terms, both in M I and A, but we found 
those coefficients to be consistent with zero, with the given precision of our data. This yields 
3 options for the mass interpolation or extrapolation of the pseudoscalars. Similarly, for the Q 
baryon that serves to set the scale, a Taylor ansatz in Ml and 2Mj c — Ml is used (cf. Sec. 4). In 
total we have 3 functional ansaetze to interpolate our data. 

A standard way to test the functional ansatz is to prune the data with mass cuts. We use 
M w < {380, 480} MeV for the scale setting and M n < {340, 380} MeV for the quark mass 
determination, thus a total of 4 mass cuts. 

A source of error which, in practice, often proves dominant is the contamination of the 
ground state in the two-point correlator by excited states. To reduce this contamination we 
use a Gaussian source and sink with a fixed width of about 0.32 fm. We tested 1-state and 
2-state fits, and found complete agreement if the 1-state fits start at t m i n ~ 0.7 fm for the 
PP, PA4, A^P, A±A± meson channels and from t min ~ 0.8 fm for the Vt. In lattice units this 
amounts to at min = {6,8,9,11,13} for (3 = {3.31,3.5,3.61,3.7,3.8} (and ~ 20% later for 
baryons). In order to estimate any remaining excited state effects, we repeated our analysis 
with an even more conservative meson fit range (starting at at m i n = {7, 9, 11, 13, 15} and again 
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(88) 
(89) 



cen. val. cr stat cx sys t 


plateau scale set fit form mass cut renorm. cont. 


3.503 0.048 0.049 
96.43 1.13 1.47 
27.531 0.196 0.083 


0.330 0.034 0.030 0.157 0.080 0.926 
0.207 0.005 0.031 0.085 0.085 0.970 
0.513 0.200 0.023 0.320 — 0.771 



Table 4: Split-up of the total systematic uncertainty of m^/ s , mf ys and ^ hys / m «/ S (f rom to P 
to bottom) into the various contributions. Entries in columns 1-3 are in MeV and refer to the 
RI/MOM scheme at fx = 4 GeV. Columns 4-10 indicate the relative share of the systematic error 
given in column 3 (the squares of these numbers add up to 1). The headers of these columns 
refer to the plateau range in the primary observables, the overall scale setting, the interpolation 
ansatz to tune to the physical mass point, the cut in the pion mass, the details of the renormal- 
ization procedure (read-off scale, chiral extrapolation), and the continuum extrapolation. 

~ 20% later for baryons). The end of the fit interval was always chosen to be at max = 2.7xat m i n 
or T/2 — 1 for lattices with a time extent shorter than 5.4 x at min . In all cases, the fits were 
performed in a correlated way. In total this gives 2 different fit ranges to make sure that con- 
tamination by excited states is under control. 

As a result of the tree-level value csw = 1 our action has formally O(aa) cut-off effects. 
However, due to the smearing the coefficient in front of this term is small, and the formally sub- 
leading 0(a 2 ) contributions might numerically dominate over the 0(aa) part. To account for 
this we augment our global fit by cutoff terms which stipulate either 0(aa) or 0(a 2 ) deviation 
from the continuum. This ambiguity comes into play in the evolution function (|69~1) and in the 
continuum extrapolation of the quark masses in the RI/MOM scheme, which yields 4 options. 

Besides the variations described above, we consider 3 options in the nonperturbative renor- 
malization procedure (scale \x, massless versus massive intermediate scheme), see Sec. 11. 

All of this serves the goal of quantifying potential systematic effects on our final results. In 
addition, there are standard methods to assess the size of the statistical error. Apart from the 
autocorrelation analysis detailed in Sec. 7, we used different blocking sizes on our ensembles, 
ranging from 1 to 10 configurations, where two adjacent configurations are separated by ten r = 
1 MD updates (cf. Sec. 5). Last but not least, we found that artificial thermalization cuts (where 
we ignore the first 20-100 configurations of the thermalized ensembles) induce no noticeable 
change in our results, and therefore we conclude that possible residual thermalization effects 
are irrelevant for the error analysis. 

Putting everything together we have 3 ansaetze for the interpolation of the quark masses to 
the physical point, 4 mass cuts in the scale setting and the quark mass determination, 2 different 
fit intervals for the primary observables, 4 ansaetze for the continuum extrapolation, and 3 ways 
of doing the RI/MOM renormalization. This gives a total of 3 ■ 4 ■ 2 • 4 • 3 = 288 analyses. 

In order to quote a final result, we follow the procedure used in [0. It is essential to note 
that we have no a priori reason to favor one of these fits over another. Therefore the analysis 
method should represent the full spread of all reasonable and theoretically justified treatments 
of our data. In other words, we use all 288 procedures, and weigh the results by the quality of 
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m s m ud 


m u m d 


RI/M0M(4 GeV) 
RGI 

MS(2 GeV) 


96.4(1.1)(1.5) 3.503(48)(49) 
127.3(1.5X1.9) 4.624(63X64) 
95.5(1.1)(1.5) 3.469(47)(48) 


2.17(04)(10) 4.84(07)(12) 
2.86(05)(13) 6.39(09)(15) 
2.15(03)(10) 4.79(07)(12) 



Table 5: Renormalized quark masses in the RI/MOM scheme at /i = 4 GeV, and after conversion 
to RGI and the MS scheme at /i = 2 GeV. The RI/MOM values are fully nonperturbative, so the 
first line is our main result. The first two columns emerge directly from our lattice calculation, 
the last two build, in addition, on dispersive information on Q. The precision of m s and m ud 
is somewhat below the 2% level, for m u and m d it is about 5% and 3%, respectively. The ratio 
ms/mud = 27.53(20)(08) is independent of the scheme and accurate to better than 1%. 

fit Q = T(n/2, x 2 /2) to form a histogram. Next, we compute the mean and standard deviation 
of the distribution, and this yields the central value and the systematic error which we quote. 
Finally, we repeat this extensive procedure on 2000 bootstrap samples. The standard bootstrap 
error of the mean gives the statistical error. 

An additional benefit of our method to treat systematic effects is that we can temporarily 
suppress one of the variations considered (i.e. abandon one of the factors who's product leads 
to the 288 procedures) to learn about the contribution of this individual factor to the total error. 
The total "error budget" compiled in this way is shown in Tab.Hl Evidently, it exhibits the 
cut-off effects as the dominant source of systematic uncertainty in our results. 

All together, our procedure to assess both statistical and systematic errors is an extended 
frequentist method 11201 which was already used in ||2). 

15 Summary 

We have carried out a precise determination of the average light quark mass m ud = (m u + 
m d )/2 and of the strange quark mass m s , using nonperturbative Nf = 2 + 1 lattice QCD and 
nonperturbative renormalization throughout. Our data cover 5 lattice spacings in the range 
0.054 — 0.116 fm, with pion masses down to ~ 120 fm and box sizes up to 6 fm. This allows for 
a safe extrapolation to the continuum (a — > 0) and to infinite volume (L — > oo). 

We have devised a number of innovative methods, most notably a scheme to exploit the 
different renormalization pattern of Wilson and PCAC quark masses with tree-level 0(a)- 
improved clover quarks and a procedure to overcome the RI/MOM window problem by taking 
a separate continuum limit of the running of the scalar density Rs(f<>, //)■ 

Our main result, m s and m ud in the RI/MOM scheme at renormalization scale ft = 4 GeV 
(cf. Tab.[5]), is from first principles and fully nonperturbative. To ease comparison with the 
literature, these values are converted to RGI conventions and, subsequently, to the MS scheme. 
In this step reference to perturbation theory is unavoidable, but we do this in a controlled way, 
since we show that the 4-loop anomalous dimension of the scalar density is consistent with our 
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Figure 14: Continuum extrapolation of m u d (top), m s (middle), m s /m ud (bottom) versus aa, 
for one of our 288 analyses with a good fit quality (cf discussion in Sec. 14). 



nonperturbative running for /x>4 GeV. The ratio m s /m u <i is scheme and scale invariant. It turns 
out that our action entails favorable scaling properties not just for hadron masses, but also for 
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renormalized quark masses, as the plot of a representative continuum extrapolation in Fig. [14] 
shows. The combination of using pion masses down to (and even below) the physical value and 
having precise and fully nonperturbative renormalization factors allows us to determine m s and 
m u d with a precision of better than 2%, and their ratio to better than 1%. 

A determination of the individual light quark masses m u and by lattice methods alone 
is beyond the scope of this paper. Nevertheless, the precision of our values of m u d and m s /m ud 
allows for a fruitful use of the result of the dispersive analysis of the double ratio Q (cf. discus- 
sion in Sec. 13). By combining these pieces of information, we obtain values of the individual 
quark masses m u and with a precision of 5% and 3%, respectively (cf. Tab.[5]). 
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